
[python][scikit-learn] Fixes a bug that prevented using multiple eval_metrics in LGBMClassifier #3222

Merged (4 commits into microsoft:master) on Jul 14, 2020

Conversation

@giresg (Contributor) commented Jul 12, 2020

This PR fixes a bug in the LGBMClassifier class. The code as-is breaks when eval_metric is a list of strings. This is meant to work (as referenced in the documentation) and does work on all derived classes of LGBMModel except LGBMClassifier. The issue is in the following code in the master branch:

if self._n_classes > 2:
    # Switch to using a multiclass objective in the underlying LGBM instance
    ova_aliases = {"multiclassova", "multiclass_ova", "ova", "ovr"}
    if self._objective not in ova_aliases and not callable(self._objective):
        self._objective = "multiclass"
    if eval_metric in {'logloss', 'binary_logloss'}:
        eval_metric = "multi_logloss"
    elif eval_metric in {'error', 'binary_error'}:
        eval_metric = "multi_error"
else:
    if eval_metric in {'logloss', 'multi_logloss'}:
        eval_metric = 'binary_logloss'
    elif eval_metric in {'error', 'multi_error'}:
        eval_metric = 'binary_error'
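The failure mode is easy to reproduce in plain Python: testing membership in a `set` hashes the left-hand operand, and a `list` is unhashable, so `eval_metric in {...}` raises a `TypeError` instead of simply evaluating to `False` (a minimal standalone illustration, not LightGBM code):

```python
# `in` on a set hashes the left-hand operand; a list is unhashable,
# so the alias check above crashes when eval_metric is a list.
eval_metric = ['fair', 'error']
try:
    eval_metric in {'logloss', 'binary_logloss'}
except TypeError as err:
    print(type(err).__name__)  # TypeError
```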

With the fix, the following line works as expected (see the tests in this PR):

gbm = lgb.LGBMClassifier(**params).fit(eval_metric=['fair', 'error'], **params_fit)
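The shape of the fix can be sketched as follows. This is a hypothetical helper (`remap_eval_metric` is not the name used in the PR, and the actual diff may differ): normalize `eval_metric` to a list, remap each string alias through a lookup table, and restore the original scalar/list shape, so callables and unknown metric names pass through untouched.

```python
# Hypothetical sketch of per-element alias remapping; not the exact PR code.
_MULTI_ALIASES = {'logloss': 'multi_logloss', 'binary_logloss': 'multi_logloss',
                  'error': 'multi_error', 'binary_error': 'multi_error'}
_BINARY_ALIASES = {'logloss': 'binary_logloss', 'multi_logloss': 'binary_logloss',
                   'error': 'binary_error', 'multi_error': 'binary_error'}

def remap_eval_metric(eval_metric, n_classes):
    """Remap metric aliases to names matching the (binary/multiclass) objective.

    Accepts a single string, a callable, None, or a list of strings/callables,
    mirroring what LGBMModel.fit() allows for eval_metric.
    """
    if eval_metric is None:
        return None
    table = _MULTI_ALIASES if n_classes > 2 else _BINARY_ALIASES
    # Wrap scalars so strings, callables, and lists share one code path.
    single = not isinstance(eval_metric, list)
    metrics = [eval_metric] if single else eval_metric
    remapped = [table.get(m, m) if isinstance(m, str) else m for m in metrics]
    return remapped[0] if single else remapped

print(remap_eval_metric(['fair', 'error'], n_classes=3))  # ['fair', 'multi_error']
print(remap_eval_metric('logloss', n_classes=2))          # binary_logloss
```

With per-element remapping, a mixed list such as `['fair', 'error']` no longer crashes the membership check, and only the aliased entries are rewritten.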

@StrikerRUS (Collaborator) left a comment

@gramirezespinoza LGTM! Thank you very much for the fix! Just one minor comment below.

@@ -922,3 +922,14 @@ def test_continue_training_with_model(self):
        self.assertEqual(len(init_gbm.evals_result_['valid_0']['multi_logloss']), 5)
        self.assertLess(gbm.evals_result_['valid_0']['multi_logloss'][-1],
                        init_gbm.evals_result_['valid_0']['multi_logloss'][-1])

    def test_eval_metrics_lgbmclassifier(self):
@StrikerRUS (Collaborator) commented Jul 12, 2020

Will it make sense to include the content of this test into the already existing huge unit test for metrics, as an enhancement of the case `... invalid multiclass metric is replaced with binary alternative ...`?

def test_metrics(self):

    # invalid multiclass metric is replaced with binary alternative for custom objective
    gbm = lgb.LGBMClassifier(objective=custom_dummy_obj,
                             **params).fit(eval_metric='multi_logloss', **params_fit)
    self.assertEqual(len(gbm.evals_result_['training']), 1)
    self.assertIn('binary_logloss', gbm.evals_result_['training'])

Or just please move this new test closer to that one because they are both about the same things.

@giresg (Contributor, PR author) commented

@StrikerRUS Yes, that definitely makes sense. I've moved the test inside test_metrics but placed it next to the section

non-default metric with multiple metrics in eval_metric

as it is highly related to that test. Thanks!

X_classification, y_classification = load_breast_cancer(True)
params_classification = {'n_estimators': 2, 'verbose': -1, 'objective': 'binary', 'metric': 'binary_logloss'}
params_fit_classification = {'X': X_classification, 'y': y_classification,
                             'eval_set': (X_classification, y_classification), 'verbose': False}
@StrikerRUS (Collaborator) commented

Please fix one linting error:

./tests/python_package_test/test_sklearn.py:547:23: E128 continuation line under-indented for visual indent
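For reference, E128 is flake8's check for a continuation line that is not aligned with the token after the opening delimiter it visually hangs from. A minimal before/after illustration (hypothetical data, not the PR's actual lines):

```python
# Hypothetical data, for illustration only.
X, y = [1, 2], [0, 1]

# Flagged by flake8 as E128: the continuation line is under-indented
# relative to the visual indent set by the opening brace.
params_fit = {'X': X,
     'y': y}

# Fixed: continuation aligned with the first element inside the brace.
params_fit = {'X': X,
              'y': y}
```

Both versions are valid Python; E128 is purely a style check, which is why the PR only needed a whitespace change.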

@giresg (Contributor, PR author) commented

Fixed!

@StrikerRUS (Collaborator) commented

Thank you! Sorry for the inconvenience, but could you please rebase onto the latest master? I've just fixed the R-package builds in #3225; those failures prevent merging.

@giresg (Contributor, PR author) commented

Rebased, hopefully I did it correctly as it is my first time rebasing an upstream branch 😄

@StrikerRUS (Collaborator) commented

Thanks!
But it seems that something went wrong: see the number of changed files.

[screenshot: changed-files count]

Maybe you can allow editing from maintainers and then I'll be able to help with cleaning up the rebasing?
https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/allowing-changes-to-a-pull-request-branch-created-from-a-fork

@giresg (Contributor, PR author) commented

[screenshot: repository settings, Jul 14, 2020]

Editing for maintainers is enabled. Sorry for this.

@StrikerRUS (Collaborator) commented

No problem! It seems I was able to exclude the excess files.

@StrikerRUS force-pushed the fix-lgbmclassifier-eval-metric branch from 0c53735 to 5bcc9ec on July 14, 2020 15:56
@StrikerRUS merged commit 7b8b515 into microsoft:master on Jul 14, 2020
@github-actions (bot) commented
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions bot locked as resolved and limited conversation to collaborators on Aug 24, 2023